2023 Journal article Open Access
Efficient and effective tree-based and neural learning to rank
Bruch S., Lucchese C., Nardini F. M.
As information retrieval researchers, we not only develop algorithmic solutions to hard problems, but we also insist on a proper, multifaceted evaluation of ideas. The literature on the fundamental topic of retrieval and ranking, for instance, has a rich history of studying the effectiveness of indexes, retrieval algorithms, and complex machine learning rankers, while at the same time quantifying their computational costs, from creation and training to application and inference. This is evidenced, for example, by more than a decade of research on efficient training and inference of large decision forest models in Learning to Rank (LtR). As we move towards even more complex, deep learning models in a wide range of applications, questions on efficiency have once again resurfaced with renewed urgency. Indeed, efficiency is no longer limited to time and space; instead it has found new, challenging dimensions that stretch to resource-, sample- and energy-efficiency with ramifications for researchers, users, and the environment. This monograph takes a step towards promoting the study of efficiency in the era of neural information retrieval by offering a comprehensive survey of the literature on efficiency and effectiveness in ranking, and to a limited extent, retrieval. This monograph was inspired by the parallels that exist between the challenges in neural network-based ranking solutions and their predecessors, decision forest-based LtR models, as well as the connections between the solutions the literature to date has to offer. We believe that by understanding the fundamentals underpinning these algorithmic and data structure solutions for containing the contentious relationship between efficiency and effectiveness, one can better identify future directions and more efficiently determine the merits of ideas. We also present what we believe to be important research directions at the forefront of efficiency and effectiveness in retrieval and ranking.
Source: Foundations and trends in information retrieval 17 (2023): 1–123. doi:10.1561/1500000071
DOI: 10.1561/1500000071
DOI: 10.48550/arxiv.2305.08680
See at: ISTI Repository Open Access | Foundations and Trends® in Information Retrieval Restricted | doi.org Restricted | CNR ExploRA


2023 Journal article Open Access
An optimal algorithm for finding champions in tournament graphs
Beretta L., Nardini F. M., Trani R., Venturini R.
A tournament graph is a complete directed graph, which can be used to model a round-robin tournament between n players. In this paper, we address the problem of finding a champion of the tournament, also known as Copeland winner, which is a player that wins the highest number of matches. In detail, we aim to investigate algorithms that find the champion by playing a low number of matches. Solving this problem allows us to speed up several Information Retrieval and Recommender System applications, including question answering, conversational search, etc. Indeed, these applications often search for the champion inducing a round-robin tournament among the players by employing a machine learning model to estimate who wins each pairwise comparison. Our contribution, thus, allows finding the champion by performing a low number of model inferences. We prove that any deterministic or randomized algorithm finding a champion with constant success probability requires Ω(ℓn) comparisons, where ℓ is the number of matches lost by the champion. We then present an asymptotically-optimal deterministic algorithm matching this lower bound without knowing ℓ, and we extend our analysis to three variants of the problem. Lastly, we conduct a comprehensive experimental assessment of the proposed algorithms on a question answering task on public data. Results show that our proposed algorithms speed up the retrieval of the champion by up to 13× with respect to the state-of-the-art algorithm that performs the full tournament. (A brief, hedged code sketch of the full-tournament baseline follows this record.)
Source: IEEE transactions on knowledge and data engineering (Online) 35 (2023): 10197–10209. doi:10.1109/TKDE.2023.3267345
DOI: 10.1109/tkde.2023.3267345
DOI: 10.48550/arxiv.2111.13621
See at: IEEE Transactions on Knowledge and Data Engineering Open Access | ISTI Repository Open Access | doi.org Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA
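
As a point of reference for the record above, here is a minimal sketch of the naive baseline the paper improves on: a full round-robin that plays every pairwise match through a comparison oracle and returns the Copeland winner. The `beats` oracle and the toy `strength` table are illustrative assumptions; the paper's algorithm finds the champion with far fewer oracle calls, matching the Ω(ℓn) lower bound.

```python
# Hedged sketch: naive Copeland-winner search that plays the full
# round-robin, counting oracle calls. Only a reference point; the
# paper's algorithm needs far fewer comparisons.
from itertools import combinations

def copeland_winner(players, beats):
    """beats(a, b) -> True iff a wins the match against b (one call =
    one inference of a pairwise ML model)."""
    wins = {p: 0 for p in players}
    calls = 0
    for a, b in combinations(players, 2):
        calls += 1
        if beats(a, b):
            wins[a] += 1
        else:
            wins[b] += 1
    champion = max(players, key=lambda p: wins[p])
    return champion, calls  # calls is always n*(n-1)/2 here

# Toy example: a hidden strength decides every match.
strength = {"a": 3, "b": 1, "c": 2, "d": 0}
champ, n_calls = copeland_winner(list(strength),
                                 lambda x, y: strength[x] > strength[y])
```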


2023 Conference article Open Access
ReNeuIR at SIGIR 2023: The Second Workshop on Reaching Efficiency in Neural Information Retrieval
Bruch S., Mackenzie J., Maistro M., Nardini F. M.
Multifaceted, empirical evaluation of algorithmic ideas is one of the central pillars of Information Retrieval (IR) research. The IR community has a rich history of studying the effectiveness of indexes, retrieval algorithms, and complex machine learning rankers and, at the same time, quantifying their computational costs, from creation and training to application and inference. As the community moves towards even more complex deep learning models, questions on efficiency have once again become relevant with renewed urgency. Indeed, efficiency is no longer limited to time and space; instead it has found new, challenging dimensions that stretch to resource-, sample- and energy-efficiency with ramifications for researchers, users, and the environment alike. Examining algorithms and models through the lens of holistic efficiency requires the establishment of standards and principles, from defining relevant concepts, to designing metrics, to creating guidelines for making sense of the significance of new findings. The second iteration of the ReNeuIR workshop aims to bring the community together to debate these questions, with the express purpose of moving towards a common benchmarking framework for efficiency.
Source: SIGIR '23: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3456–3459, Taipei, Taiwan, China, 23-27/07/2023
DOI: 10.1145/3539618.3591922
See at: ISTI Repository Open Access | doi.org Restricted | CNR ExploRA


2023 Conference article Open Access
Post-Hoc selection of pareto-optimal solutions in search and recommendation
Paparella V., Anelli V. W., Nardini F. M., Perego R., Di Noia T.
Information Retrieval (IR) and Recommender Systems (RSs) tasks are moving from computing a ranking of final results based on a single metric to multi-objective problems. Solving these problems leads to a set of Pareto-optimal solutions, known as Pareto frontier, in which no objective can be further improved without hurting the others. In principle, all the points on the Pareto frontier are potential candidates to represent the best model selected with respect to the combination of two, or more, metrics. To our knowledge, there are no well-recognized strategies to decide which point should be selected on the frontier in IR and RSs. In this paper, we propose a novel, post-hoc, theoretically-justified technique, named "Population Distance from Utopia" (PDU), to identify and select the one-best Pareto-optimal solution. PDU considers fine-grained utopia points, and measures how far each point is from its utopia point, allowing the selection of solutions tailored to user preferences, a novel feature we call "calibration". We compare PDU against state-of-the-art strategies through extensive experiments on tasks from both IR and RS, showing that PDU combined with calibration notably impacts the solution selection. (A simplified code sketch of utopia-distance selection follows this record.)
Source: CIKM '23 - 32nd ACM International Conference on Information and Knowledge Management, pp. 2013–2023, Birmingham, UK, 21-25/10/2023
DOI: 10.1145/3583780.3615010
See at: ISTI Repository Open Access | CNR ExploRA
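
A minimal sketch of the distance-from-utopia idea behind PDU, under simplifying assumptions: a single global utopia point (the best value per objective) rather than the paper's fine-grained, per-user utopia points, and a plain normalized Euclidean distance. `select_from_frontier` is an illustrative name, not the authors' API.

```python
# Hedged sketch: pick the Pareto-frontier point closest to a global
# utopia point. PDU proper uses fine-grained utopia points ("calibration"),
# which this simplification omits.
import numpy as np

def select_from_frontier(frontier: np.ndarray) -> int:
    """frontier: (n_solutions, n_objectives), all objectives to maximize.
    Returns the index of the solution closest to the utopia point."""
    utopia = frontier.max(axis=0)               # best value per objective
    span = np.ptp(frontier, axis=0) + 1e-12     # normalize objective scales
    dist = np.linalg.norm((utopia - frontier) / span, axis=1)
    return int(dist.argmin())

# Three Pareto-optimal (accuracy, diversity) trade-offs:
frontier = np.array([[0.80, 0.30], [0.75, 0.45], [0.70, 0.50]])
best = select_from_frontier(frontier)
```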


2023 Conference article Open Access
Can embeddings analysis explain large language model ranking?
Lucchese C., Minello G., Nardini F. M., Orlando S., Perego R., Veneri A.
Understanding the behavior of deep neural networks for Information Retrieval (IR) is crucial to improve trust in these effective models. Current popular approaches to diagnose the predictions made by deep neural networks are mainly based on: i) the adherence of the retrieval model to some axiomatic property of the IR system, ii) the generation of free-text explanations, or iii) feature importance attributions. In this work, we propose a novel approach that analyzes the changes of document and query embeddings in the latent space and that might explain the inner workings of IR large pre-trained language models. In particular, we focus on predicting query/document relevance, and we characterize the predictions by analyzing the topological arrangement of the embeddings in their latent space and their evolution while passing through the layers of the network. We show that there exists a link between the embedding adjustment and the predicted score, based on how tokens cluster in the embedding space. This novel approach, grounded in the query and document tokens interplay over the latent space, provides a new perspective on neural ranker explanation and a promising strategy for improving the efficiency of the models and Query Performance Prediction (QPP). (A short sketch of per-layer embedding analysis follows this record.)
Source: CIKM '23 - 32nd ACM International Conference on Information and Knowledge Management, pp. 4150–4154, Birmingham, UK, 21-25/10/2023
DOI: 10.1145/3583780.3615225
See at: ISTI Repository Open Access | CNR ExploRA
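
A hedged sketch of the kind of latent-space analysis the record above describes: extract a model's hidden states layer by layer and track how tightly query and document tokens cluster as they pass through the network. The model choice (bert-base-uncased) and the clustering proxy (mean pairwise cosine similarity) are our assumptions, not the paper's exact pipeline.

```python
# Hedged sketch: per-layer token-similarity analysis of a query/document
# pair with a pre-trained encoder.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tok("what is the capital of italy",
             "rome is the capital of italy", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states

for layer, h in enumerate(hidden):              # (1, seq_len, dim) per layer
    e = torch.nn.functional.normalize(h[0], dim=-1)
    sim = e @ e.T                               # pairwise cosine similarities
    off_diag = sim[~torch.eye(sim.size(0), dtype=torch.bool)]
    print(f"layer {layer:2d} mean token similarity "
          f"{off_diag.mean().item():.3f}")
```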


2023 Journal article Open Access
An approximate algorithm for maximum inner product search over streaming sparse vectors
Bruch S., Nardini F. M., Ingber A., Liberty E.
Maximum Inner Product Search or top-k retrieval on sparse vectors is well-understood in information retrieval, with a number of mature algorithms that solve it exactly. However, all existing algorithms are tailored to text and frequency-based similarity measures. To achieve optimal memory footprint and query latency, they rely on the near stationarity of documents and on laws governing natural languages. We consider, instead, a setup in which collections are streaming, necessitating dynamic indexing, and where indexing and retrieval must work with arbitrarily distributed real-valued vectors. As we show, existing algorithms are no longer competitive in this setup, even against naïve solutions. We investigate this gap and present a novel approximate solution, called Sinnamon, that can efficiently retrieve the top-k results for sparse real-valued vectors drawn from arbitrary distributions. Notably, Sinnamon offers levers to trade off memory consumption, latency, and accuracy, making the algorithm suitable for constrained applications and systems. We give theoretical results on the error introduced by the approximate nature of the algorithm, and present an empirical evaluation of its performance on two hardware platforms and synthetic and real-valued datasets. We conclude by laying out concrete directions for future research on this general top-k retrieval problem over sparse vectors. (A minimal exact top-k baseline sketch follows this record.)
Source: ACM transactions on information systems (2023). doi:10.1145/3609797
DOI: 10.1145/3609797
See at: dl.acm.org Open Access | ISTI Repository Open Access | ACM Transactions on Information Systems Restricted | CNR ExploRA
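
To fix the problem being solved, a minimal exact baseline for top-k maximum inner product over sparse vectors: a document-at-a-time scan with a size-k min-heap. Sinnamon replaces this exact scan with an approximate, sketch-based index that supports streaming insertions; the dictionary-of-dictionaries layout below is purely illustrative.

```python
# Hedged sketch: exact top-k inner product over sparse vectors.
import heapq

def topk_mips(query: dict, docs: dict, k: int):
    """query: {dim: value}; docs: {doc_id: {dim: value}} sparse vectors."""
    heap = []  # min-heap of (score, doc_id), size <= k
    for doc_id, vec in docs.items():
        score = sum(v * vec[d] for d, v in query.items() if d in vec)
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True)

docs = {"d1": {0: 0.5, 7: 1.2}, "d2": {7: 0.9, 3: 0.4}, "d3": {0: 2.0}}
print(topk_mips({0: 1.0, 7: 1.0}, docs, k=2))
```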


2023 Contribution to conference Open Access
String processing and information retrieval - SPIRE 2023 - Proceedings of the 30th International Symposium, Pisa, Italy, September 26-28, 2023
Nardini F. M., Pisanti N., Venturini R.
The 30th International Symposium on String Processing and Information Retrieval (SPIRE) was held on September 26-28, 2023, in Pisa (Italy), followed by the 18th Workshop on Compression, Text, and Algorithms (WCTA) held on September 29, 2023. SPIRE started in 1993 as the South American Workshop on String Processing. It was held in Latin America until 2000. Then, SPIRE moved to Europe, and from then on, it has been held in Australia, Japan, the UK, Spain, Italy, Finland, Portugal, Israel, Brazil, Chile, Colombia, Mexico, Argentina, Bolivia, Peru, the USA, and France. SPIRE continues the long and well-established tradition of encouraging high-quality research at the broad nexus of string processing, information retrieval, and computational biology. This volume contains the accepted papers presented at SPIRE 2023. SPIRE 2023 received a total of 47 submissions. Each submission received at least three reviews. After an intensive discussion phase, the Scientific Program Committee accepted 31 papers. We thank all the authors for their valuable contributions and presentations at the conference and the Program Committee members for their valuable work during the review and discussion phases. We thank Springer for publishing the proceedings of SPIRE 2023 in the LNCS series and ACM SIGIR for sponsoring the conference. The scientific program of SPIRE 2023 includes invited talks by three eminent researchers in the field: Sebastian Bruch (Pinecone, USA), Inge Li Gørtz (Technical University of Denmark, Denmark), and Jakub Radoszewski (University of Warsaw, Poland). SPIRE 2023 had a Best Paper Award, sponsored by Springer. The award was announced during the conference. Finally, we thank the Local Organizing Committee members for making the conference successful.
Source: New York: Springer, 2023
DOI: 10.1007/978-3-031-43980-3
See at: ISTI Repository Open Access | CNR ExploRA


2023 Contribution to conference Open Access
Proceedings of the 13th Italian Information Retrieval Workshop (IIR 2023), Pisa, Italy, June 8-9, 2023
Nardini F. M., Tonellotto N., Faggioli G., Ferrara A.
There were 33 papers submitted to this workshop. Out of these, 24 were accepted for this volume: 1 as a regular paper and 23 as short papers.
Source: Aachen: CEUR-WS.org, 2023

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA


2023 Conference article Open Access
Commonsense injection in conversational systems: an adaptable framework for query expansion
Rocchietti G., Frieder O., Muntean C. I., Nardini F. M., Perego R.
Recent advancements in conversational agents are leading a paradigm shift in how people search for their information needs, from text queries to entire spoken conversations. This paradigm shift poses a new challenge: a single question may lack the context driven by the entire conversation. We propose and evaluate a framework to deal with multi-turn conversations with the injection of commonsense knowledge. Specifically, we propose a novel approach for conversational search that uses pre-trained large language models and commonsense knowledge bases to enrich queries with relevant concepts. Our framework comprises a generator of candidate concepts related to the context of the conversation and a selector for deciding which candidate concept to add to the current utterance to improve retrieval effectiveness. We use the TREC CAsT datasets and ConceptNet to show that our framework improves retrieval performance by up to 82% in terms of Recall@200 and up to 154% in terms of NDCG@3 as compared to the performance achieved by the original utterances in the conversations. (An illustrative expansion sketch follows this record.)
Source: WI-IAT 2023 - 22nd International Conference on Web Intelligence and Intelligent Agent Technology, pp. 48–55, Venezia, Italy, 26-29/10/2023
DOI: 10.1109/wi-iat59888.2023.00013
See at: ISTI Repository Open Access | ieeexplore.ieee.org Restricted | CNR ExploRA
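
A simplified sketch of the generate-then-select idea using the public ConceptNet API: fetch concepts related to a term and append the strongest ones to the utterance. The weight threshold and the single-term generator are illustrative assumptions; the paper's framework generates candidates from the whole conversation context and selects them with a learned component.

```python
# Hedged sketch: commonsense query expansion via ConceptNet.
import requests

def candidate_concepts(term: str, limit: int = 20, min_weight: float = 2.0):
    url = f"https://api.conceptnet.io/c/en/{term.replace(' ', '_')}"
    edges = requests.get(url, params={"limit": limit}).json()["edges"]
    related = set()
    for e in edges:
        if e["weight"] >= min_weight:         # keep only strong relations
            for node in (e["start"], e["end"]):
                if node["label"].lower() != term:
                    related.add(node["label"])
    return related

# Enrich an ambiguous utterance with a few related concepts.
utterance = "what are its main exports"
expanded = utterance + " " + " ".join(sorted(candidate_concepts("italy"))[:3])
```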


2023 Conference article Open Access
Rewriting conversational utterances with instructed large language models
Galimzhanova E., Muntean C. I., Nardini F. M., Perego R., Rocchietti G.
Many recent studies have shown the ability of large language models (LLMs) to achieve state-of-the-art performance on many NLP tasks, such as question answering, text summarization, coding, and translation. In some cases, the results provided by LLMs are on par with those of human experts. These models' most disruptive innovation is their ability to perform tasks via zero-shot or few-shot prompting. This capability has been successfully exploited to train instructed LLMs, where reinforcement learning with human feedback is used to guide the model to follow the user's requests directly. In this paper, we investigate the ability of instructed LLMs to improve conversational search effectiveness by rewriting user questions in a conversational setting. We study which prompts provide the most informative rewritten utterances that lead to the best retrieval performance. Reproducible experiments are conducted on publicly-available TREC CAsT datasets. The results show that rewriting conversational utterances with instructed LLMs achieves significant improvements of up to 25.2% in MRR, 31.7% in Precision@1, 27% in NDCG@3, and 11.5% in Recall@500 over state-of-the-art techniques. (A hedged rewriting-prompt sketch follows this record.)
Source: WI-IAT 2023 - 22nd International Conference on Web Intelligence and Intelligent Agent Technology, pp. 56–63, Venezia, Italy, 26-29/10/2023
DOI: 10.1109/wi-iat59888.2023.00014
See at: ISTI Repository Open Access | ieeexplore.ieee.org Restricted | CNR ExploRA
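
A hedged sketch of utterance rewriting with an instructed LLM. `chat` is a placeholder for whatever completion API is available, and the prompt wording is illustrative, not one of the prompts evaluated in the paper.

```python
# Hedged sketch: rewrite a conversational utterance into a self-contained
# query using an instructed LLM behind a generic `chat(prompt)` callable.
def rewrite(history: list[str], utterance: str, chat) -> str:
    prompt = (
        "Rewrite the last question so that it is fully self-contained, "
        "resolving every pronoun and implicit reference using the "
        "conversation below. Return only the rewritten question.\n\n"
        "Conversation:\n" + "\n".join(history) +
        f"\nLast question: {utterance}"
    )
    return chat(prompt)

history = ["Q: Who wrote The Name of the Rose?", "A: Umberto Eco."]
# rewrite(history, "When did he die?", chat)
#   -> e.g. "When did Umberto Eco die?"
```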


2023 Journal article Open Access
Early exit strategies for learning-to-rank cascades
Busolin F., Lucchese C., Nardini F. M., Orlando S., Perego R., Trani S.
The ranking pipelines of modern search platforms commonly exploit complex machine-learned models and have a significant impact on the query response time. In this paper, we discuss several techniques to speed up the document scoring process based on large ensembles of decision trees without hindering ranking quality. Specifically, we study the problem of document early exit within the framework of a cascading ranker made of three components: 1) an efficient but sub-optimal ranking stage; 2) a pruner that exploits signals from the previous component to force the early exit of documents classified as not relevant; and 3) a final high-quality component aimed at finely ranking the documents that survived the previous phase. To maximize speedup and preserve effectiveness, we aim to increase the accuracy of the pruner in identifying non-relevant documents without early exiting documents that are likely to be ranked among the final top-k results. We propose an in-depth study of heuristic and machine-learning techniques for designing the pruner. While the heuristic technique only exploits the score/ranking information supplied by the first sub-optimal ranker, the machine-learned solution named LEAR uses these signals as additional features along with those representing query-document pairs. Moreover, we study alternative solutions to implement the first ranker, either a small prefix of the original forest or an auxiliary machine-learned ranker explicitly trained for this purpose. We evaluated our techniques through reproducible experiments using publicly available datasets and state-of-the-art competitors. The experiments confirm that our early-exit strategies achieve speedups ranging from 3× to 10× without statistically significant differences in effectiveness. (A schematic cascade sketch follows this record.)
Source: IEEE access 11 (2023): 126691–126704. doi:10.1109/ACCESS.2023.3331088
DOI: 10.1109/access.2023.3331088
See at: CNR ExploRA
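
A schematic sketch of the three-component cascade described above: a cheap first-stage ranker, a pruner that early-exits documents unlikely to reach the final top-k, and a costly ranker for the survivors. The fixed rank cutoff stands in for the paper's heuristic and machine-learned pruners (e.g., LEAR).

```python
# Hedged sketch: a three-stage early-exit ranking cascade.
def cascade_rank(docs, cheap_score, full_score, k=10, survivors=100):
    # Stage 1: efficient, sub-optimal scoring of all candidates.
    stage1 = sorted(docs, key=cheap_score, reverse=True)
    # Stage 2: pruner -- early-exit everything below a rank cutoff,
    # a stand-in for a learned relevance-based exit decision.
    kept = stage1[:survivors]
    # Stage 3: high-quality (costly) ranker only on the survivors.
    return sorted(kept, key=full_score, reverse=True)[:k]
```

The speedup comes from stage 3 scoring only `survivors` documents instead of the whole candidate set; effectiveness is preserved as long as the pruner rarely exits documents that belong in the final top-k.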


2023 Contribution to book Open Access
PerconAI 2023: 2nd Workshop on Pervasive and Resource-Constrained Artificial Intelligence - Welcome from General Chairs
Angelov P., Bernardi M. L., Nardini F. M., Pecori R., Valerio L.
The PeRConAI workshop aims at promoting the circulation of new ideas and research directions on pervasive and resource-constrained artificial intelligence, serving as a forum for practitioners and researchers working on the intersection between pervasive computing and machine learning, including deep learning and artificial intelligence. The workshop welcomes theoretical and applied research sharing the common objective of investigating solutions advancing towards a truly pervasive and liquid AI. In the long term, we envision a future where every device at the edge of the Internet, regardless of its computing capabilities, will have an active role in the AI process by processing its data and collaborating with other devices to extract knowledge from them. The workshop international program committee refereed the submitted full papers on the basis of their quality and relevance to the theme. The final program includes four carefully selected papers out of ten submissions. The workshop includes a keynote and two technical sessions covering the two main aspects of AI in pervasive computing. The first session covers distributed and federated learning aspects in pervasive contexts and inference on embedded devices. The papers in the second session address problems related to real-time privacy preservation and resource allocation for vision-related problems at the edge. The keynote speech, titled "Decentralized Intelligence: towards collaborative and sustainable learning", will take place after the opening remarks, and it will be delivered by Dr. Paolo Dini (CTTC, Spain). We want to thank the program committee members for providing detailed and thoughtful reviews, thus making it possible to assemble an exciting program of high-quality papers. We would also like to thank the PerCom organizers, particularly Prof. Carlo Vallati and Prof. Qi Han, the PerCom workshop co-chairs, for supporting the workshop and assisting with its organization. Finally, we thank all authors and workshop attendees for their contributions and participation.
Source: , pp. 105–107, 2023

See at: ISTI Repository Open Access | ieeexplore.ieee.org Restricted | CNR ExploRA


2022 Journal article Open Access
Dynamic hard pruning of Neural Networks at the edge of the internet
Valerio L., Nardini F. M., Passarella A., Perego R.
Neural Networks (NN), although successfully applied to several Artificial Intelligence tasks, are often unnecessarily over-parametrized. In edge/fog computing, this might make their training prohibitive on resource-constrained devices, contrasting with the current trend of decentralizing intelligence from remote data centres to local constrained devices. Therefore, we investigate the problem of training effective NN models on constrained devices having a fixed, potentially small, memory budget. We target techniques that are both resource-efficient and performance effective while enabling significant network compression. Our Dynamic Hard Pruning (DynHP) technique incrementally prunes the network during training, identifying neurons that marginally contribute to the model accuracy. DynHP enables a tunable size reduction of the final neural network and reduces the NN memory occupancy during training. Freed memory is reused by a dynamic batch sizing approach to counterbalance the accuracy degradation caused by the hard pruning strategy, improving its convergence and effectiveness. We assess the performance of DynHP through reproducible experiments on three public datasets, comparing it against reference competitors. Results show that DynHP compresses a NN up to 10 times without significant performance drops (up to 3.5% additional error w.r.t. the competitors), reducing up to 80% the training memory occupancy. (A magnitude-pruning sketch follows this record.)
Source: Journal of network and computer applications 200 (2022). doi:10.1016/j.jnca.2021.103330
DOI: 10.1016/j.jnca.2021.103330
See at: ISTI Repository Open Access | www.sciencedirect.com Restricted | CNR ExploRA
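
A minimal sketch of the hard magnitude-pruning step at the core of DynHP: after each training epoch, zero out (and keep zeroed) the smallest-magnitude fraction of weights. The dynamic batch sizing that reuses the freed memory is only noted in a comment; `hard_prune` is an illustrative helper, not the authors' code.

```python
# Hedged sketch: hard magnitude pruning of a weight tensor.
import numpy as np

def hard_prune(weights: np.ndarray, fraction: float) -> np.ndarray:
    """Return a 0/1 mask that keeps (approximately) the largest
    (1 - fraction) of the weights by magnitude."""
    k = int(weights.size * fraction)
    if k == 0:
        return np.ones_like(weights)
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)

w = np.random.randn(4, 4)
mask = hard_prune(w, fraction=0.5)   # after each update step: w *= mask
# The memory freed by pruning can then back a larger mini-batch, which is
# how DynHP counterbalances the accuracy loss from hard pruning.
```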


2022 Journal article Open Access
Distilled neural networks for efficient learning to rank
Nardini F. M., Rulli C., Trani S., Venturini R.
Recent studies in Learning to Rank have shown the possibility to effectively distill a neural network from an ensemble of regression trees. This result leads neural networks to become a natural competitor of tree-based ensembles on the ranking task. Nevertheless, ensembles of regression trees outperform neural models both in terms of efficiency and effectiveness, particularly when scoring on CPU. In this paper, we propose an approach for speeding up neural scoring time by applying a combination of Distillation, Pruning and Fast Matrix multiplication. We employ knowledge distillation to learn shallow neural networks from an ensemble of regression trees. Then, we exploit an efficiency-oriented pruning technique that performs a sparsification of the most computationally-intensive layers of the neural network that is then scored with optimized sparse matrix multiplication. Moreover, by studying both dense and sparse high performance matrix multiplication, we develop a scoring time prediction model which helps in devising neural network architectures that match the desired efficiency requirements. Comprehensive experiments on two public learning-to-rank datasets show that neural networks produced with our novel approach are competitive at any point of the effectiveness-efficiency trade-off when compared with tree-based ensembles, providing up to 4× scoring-time speed-up without affecting the ranking quality. (A distillation-step sketch follows this record.)
Source: IEEE transactions on knowledge and data engineering (Online) 35 (2022): 4695–4712. doi:10.1109/TKDE.2022.3152585
DOI: 10.1109/tkde.2022.3152585
See at: ISTI Repository Open Access | ieeexplore.ieee.org Restricted | CNR ExploRA
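
A minimal sketch of the distillation step only: fit a small neural student on the scores of a tree-ensemble teacher. LightGBM and the tiny scikit-learn MLP below are stand-ins for the actual models, and the pruning and sparse-multiplication stages of the paper are omitted.

```python
# Hedged sketch: knowledge distillation from a tree ensemble to a
# shallow neural network, on synthetic stand-in data.
import lightgbm as lgb
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.random.rand(5000, 136)             # query-document feature vectors
y = np.random.randint(0, 5, size=5000)    # graded relevance labels

teacher = lgb.LGBMRegressor(n_estimators=200).fit(X, y)
soft_targets = teacher.predict(X)         # the knowledge to distill

student = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=50)
student.fit(X, soft_targets)              # regress on the teacher's scores
```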


2022 Conference article Closed Access
Ensemble model compression for fast and energy-efficient ranking on FPGAs
Gil-Costa V., Loor F., Molina R., Nardini F. M., Perego R., Trani S.
We investigate novel SoC-FPGA solutions for fast and energy-efficient ranking based on machine-learned ensembles of decision trees. Since the memory footprint of ranking ensembles limits the effective exploitation of programmable logic for large-scale inference tasks, we investigate binning and quantization techniques to reduce the memory occupation of the learned model and we optimize the state-of-the-art ensemble-traversal algorithm for deployment on low-cost, energy-efficient FPGA devices. The results of the experiments conducted using publicly available Learning-to-Rank datasets show that our model compression techniques do not significantly impact accuracy. Moreover, the reduced space requirements allow the models and the logic to be replicated on the FPGA device in order to execute several inference tasks in parallel. We discuss in detail the experimental settings and the feasibility of deploying the proposed solution in a real setting. The results of the experiments show that our FPGA solution achieves state-of-the-art performance and consumes from 9× up to 19.8× less energy than an equivalent multi-threaded CPU implementation. (A threshold-quantization sketch follows this record.)
Source: ECIR 2022 - 44th European Conference on IR Research, pp. 260–273, Stavanger, Norway, 10-14/04/2022
DOI: 10.1007/978-3-030-99736-6_18
See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA
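
A toy sketch of the compression idea: linearly quantize the real-valued split thresholds of a tree ensemble to 8-bit codes so that the model fits in FPGA block RAM. Feature-value binning and the traversal-algorithm optimizations from the paper are out of scope here, and the uniform quantizer is our own assumption.

```python
# Hedged sketch: uniform 8-bit quantization of ensemble split thresholds.
import numpy as np

def quantize(thresholds: np.ndarray, bits: int = 8):
    lo, hi = thresholds.min(), thresholds.max()
    scale = max((hi - lo) / (2**bits - 1), 1e-12)
    codes = np.round((thresholds - lo) / scale).astype(np.uint8)
    return codes, lo, scale                    # dequant: lo + codes * scale

t = np.random.randn(10_000).astype(np.float32)  # thresholds of an ensemble
codes, lo, scale = quantize(t)                  # 4 bytes -> 1 byte each
err = np.abs(t - (lo + codes * scale)).max()    # bounded quantization error
```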


2022 Conference article Open Access
The Istella22 dataset: bridging traditional and neural learning to rank evaluation
Dato D., Macavaney S., Nardini F. M., Perego R., Tonellotto N.
Neural approaches that use pre-trained language models are effective at various ranking tasks, such as question answering and ad-hoc document ranking. However, their effectiveness compared to feature-based Learning-to-Rank (LtR) methods has not yet been well-established. A major reason for this is that present LtR benchmarks that contain query-document feature vectors do not contain the raw query and document text needed for neural models. On the other hand, the benchmarks often used for evaluating neural models, e.g., MS MARCO, TREC Robust, etc., provide text but do not provide query-document feature vectors. In this paper, we present Istella22, a new dataset that enables such comparisons by providing both query/document text and strong query-document feature vectors used by an industrial search engine. The dataset consists of a comprehensive corpus of 8.4M web documents, a collection of query-document pairs including 220 hand-crafted features, relevance judgments on a 5-graded scale, and a set of 2,198 textual queries used for testing purposes. Istella22 enables a fair evaluation of traditional learning-to-rank and transfer ranking techniques on the same data. LtR models exploit the feature-based representations of training samples while pre-trained transformer-based neural rankers can be evaluated on the corresponding textual content of queries and documents. Through preliminary experiments on Istella22, we find that neural re-ranking approaches lag behind LtR models in terms of effectiveness. However, LtR models identify the scores from neural models as strong signals.
Source: SIGIR '22 - 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3099–3107, Madrid, Spain, 11-15/07/2022
DOI: 10.1145/3477495.3531740
See at: ISTI Repository Open Access | dl.acm.org Restricted | CNR ExploRA


2022 Conference article Open Access
ILMART: interpretable ranking with constrained LambdaMART
Lucchese C., Nardini F. M., Orlando S., Perego R., Veneri A.
Interpretable Learning to Rank (LtR) is an emerging field within the research area of explainable AI, aiming at developing intelligible and accurate predictive models. While most of the previous research efforts focus on creating post-hoc explanations, in this paper we investigate how to train effective and intrinsically-interpretable ranking models. Developing these models is particularly challenging and it also requires finding a trade-off between ranking quality and model complexity. State-of-the-art rankers, made of either large ensembles of trees or several neural layers, exploit in fact an unlimited number of feature interactions making them black boxes. Previous approaches on intrinsically-interpretable ranking models address this issue by avoiding interactions between features, thus paying a significant performance drop with respect to full-complexity models. Conversely, ILMART, our novel and interpretable LtR solution based on LambdaMART, is able to train effective and intelligible models by exploiting a limited and controlled number of pairwise feature interactions. Exhaustive and reproducible experiments conducted on three publicly-available LtR datasets show that ILMART outperforms the current state-of-the-art solution for interpretable ranking by a large margin, with a gain in nDCG of up to 8%. (A constrained-LambdaMART sketch follows this record.)
Source: SIGIR '22 - 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2255–2259, Madrid, Spain, 11-15/07/2022
DOI: 10.1145/3477495.3531840
See at: ISTI Repository Open Access | dl.acm.org Restricted | CNR ExploRA
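
A hedged sketch of the constrained-interaction flavor of ILMART using LightGBM's interaction_constraints parameter with a lambdarank objective: each tree may only combine features within one allowed group. ILMART itself selects and grows the allowed interactions incrementally; the fixed groups and synthetic data below are simplifications.

```python
# Hedged sketch: LambdaMART with a limited, controlled set of pairwise
# feature interactions via LightGBM interaction constraints.
import lightgbm as lgb
import numpy as np

X = np.random.rand(1000, 6)                 # query-document features
y = np.random.randint(0, 5, size=1000)      # graded relevance labels
group = [100] * 10                          # 10 queries x 100 docs each

ranker = lgb.LGBMRanker(
    objective="lambdarank",
    n_estimators=100,
    # Trees may combine features only within one group: the single
    # features 0..3 (no interactions) plus one allowed pair (4, 5).
    interaction_constraints=[[0], [1], [2], [3], [4, 5]],
)
ranker.fit(X, y, group=group)
```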


2022 Contribution to conference Open Access
Energy-efficient ranking on FPGAs through ensemble model compression (Abstract)
Gil-Costa V., Loor F., Molina R., Nardini F. M., Perego R., Trani S.
In this talk, we present the main results of a paper accepted at ECIR 2022 [1]. We investigate novel SoC-FPGA solutions for fast and energy-efficient ranking based on machine-learned ensembles of decision trees. Since the memory footprint of ranking ensembles limits the effective exploitation of programmable logic for large-scale inference tasks [2], we investigate binning and quantization techniques to reduce the memory occupation of the learned model and we optimize the state-of-the-art ensemble-traversal algorithm for deployment on low-cost, energy-efficient FPGA devices. The results of the experiments conducted using publicly available Learning-to-Rank datasets show that our model compression techniques do not significantly impact accuracy. Moreover, the reduced space requirements allow the models and the logic to be replicated on the FPGA device in order to execute several inference tasks in parallel. We discuss in detail the experimental settings and the feasibility of deploying the proposed solution in a real setting. The results of the experiments show that our FPGA solution achieves state-of-the-art performance and consumes from 9× up to 19.8× less energy than an equivalent multi-threaded CPU implementation.
Source: IIR 2022 - 12th Italian Information Retrieval Workshop 2022, Tirrenia, Pisa, Italy, 19-22/06/2022

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA


2022 Conference article Open Access
ReNeuIR: Reaching Efficiency in Neural Information Retrieval
Bruch S., Lucchese C., Nardini F. M.
Perhaps the applied nature of information retrieval research goes some way to explain the community's rich history of evaluating machine learning models holistically, understanding that efficacy matters but so does the computational cost incurred to achieve it. This is evidenced, for example, by more than a decade of research on efficient training and inference of large decision forest models in learning-to-rank. As the community adopts even more complex, neural network-based models in a wide range of applications, questions on efficiency have once again become relevant. We propose this workshop as a forum for a critical discussion of efficiency in the era of neural information retrieval, to encourage debate on the current state and future directions of research in this space, and to promote more sustainable research by identifying best practices in the development and evaluation of neural models for information retrieval.
Source: SIGIR 2022 - The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3462–3465, Madrid, Spain, 11-15/07/2022
DOI: 10.1145/3477495.3531704
See at: ISTI Repository Open Access | dl.acm.org Restricted | doi.org Restricted | CNR ExploRA


2022 Contribution to conference Open Access
Interpretable ranking using LambdaMART (Abstract)
Lucchese C., Nardini F. M., Orlando S., Perego R., Veneri A.
Source: IIR 2022 - 12th Italian Information Retrieval Workshop 2022, Milano, Italy, 29-30/06/2022

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA